
Can ChatGPT Understand Causal Language in Science Claims?

https://doi.org/10.18653/v1/2023.wassa-1.33

Kim, Yuheun; Guo, Lu; Yu, Bei; Li, Yingya (July 2023, Association for Computational Linguistics)

This study evaluated ChatGPT’s ability to understand causal language in science papers and news by testing its accuracy on the task of labeling the strength of a claim as causal, conditional causal, correlational, or no relationship. The results show that ChatGPT still lags behind existing fine-tuned BERT models by a large margin. ChatGPT also had difficulty understanding conditional causal claims mitigated by hedges. However, this weakness may be used to improve the clarity of human annotation guidelines. Chain-of-thought prompting was faithful and helpful for improving prompt performance, but finding the optimal prompt is difficult: results are inconsistent, and there is no effective method for establishing cause-effect relationships between prompts and outcomes. This suggests caution when generalizing prompt engineering results across tasks or models.

Full Text Available
